Week 2: Economic Fundamentals and Regression-Based Prediction

In this blog, I will attempt to forecast the outcome of the 2024 US presidential election. As a part of Gov 1347: Election Analytics, I will use data from a variety of sources to develop a compelling model.

This week, I explore how well economic fundamentals at the national and state level predict the incumbent party candidate’s share of the national popular vote. I evaluate bivariate models for a range of national economic measures, using cross-validation to check out-of-sample fit, and compare them with models built on state-level measures to weigh sociotropic against individual voting.

(ADD ECONOMIC FUNDAMENTALS LITERATURE NOTES, RETROSPECTIVE)

Economic fundamentals, whose values have been tracked by agencies including the St. Louis Federal Reserve (FRED) and the Department of Commerce Bureau of Economic Analysis (BEA) going back to the early twentieth century, can provide insight into why people vote as they do and how to forecast the result of the 2024 presidential election. Today’s analysis uses the national popular vote percentage for the incumbent party as the dependent variable. The incumbent party presides over economic conditions in the United States leading up to the election, and those conditions have been shown to directly affect its vote share (LINK). Voters’ use of past data or experience to inform their decisions about a particular candidate is known as retrospective voting. The applicability of the retrospective voting hypothesis will be explored below (LITERATURE).

Assumptions and Decisions

Before beginning, I want to note how I have restricted my data. Because older data may not apply to today’s voters, this week I focus on the elections from 1952 to 2016. Even this range may be too wide: who can vote has changed, with laws like the Voting Rights Act of 1965 reshaping the electorate, and so has how people vote, with indications that the economy may not be as prominent a factor in retrospective voting decisions as it once was (LITERATURE). In the future, I plan to weight older elections in my models by amounts still to be determined. I have also excluded 2020, since the pandemic produced outlier economic results that could skew the predictions. The 2020 results may still affect the 2024 outcome in ways I do not control for, which could bias my model, so I will need to reevaluate this decision in later iterations.

National Economic Predictors

economic model of voting behavior (add literature)

National economic variables and their relationship to popular vote outcomes can define the economic model of voting behavior. I look specifically at quarter two (Q2) values in election years across a variety of government-measured variables; I use Q2 because the retrospective model of voting holds that recent events have more impact on voting decisions (LIT). From these, I will determine which predictors provide worthwhile insight and could be used to predict 2024 results. The variables I will examine include:

  • GDP: gross domestic product in billions
  • GDP Growth Q2: quarterly GDP growth in Q2
  • RDPI: real disposable personal income in dollars
  • RDPI Growth Q2: quarterly RDPI growth in Q2
  • CPI: consumer price index in dollars
  • Unemployment: unemployment rate as a percent
  • SP500 close, open, high, low, adjusted close, and volume: SP500 values in dollars
  • DPI: disposable personal income

Per (LIT), I will begin by examining bivariate regression models that use quarterly GDP growth and quarterly RDPI growth as predictors.

Analyzing the fit of these two bivariate regression models, it is important to note that neither is particularly strong, as evidenced by the in-sample fit. RDPI growth accounts for only 11.15% of the variance in incumbent popular vote share (the \(R^2\) metric). Quarter two GDP growth does better, explaining 32.48% of the variation. While, as I will show below, these are the top two predictors in terms of in-sample fit among the national economic variables described, their overall performance does not indicate strong predictive power.
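As a rough illustration of what a bivariate regression and its \(R^2\) involve, here is a minimal Python sketch. The function names and the toy numbers are illustrative assumptions; they are not the election data or the code behind the results in this post.

```python
from statistics import mean


def fit_bivariate(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """In-sample share of the variance in y explained by the fitted line."""
    y_bar = mean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical toy values, NOT the real election data.
growth = [1.0, -0.5, 2.0, 0.5, 3.0, 1.5]          # predictor, e.g. Q2 growth
vote_share = [50.0, 47.0, 53.0, 51.0, 52.0, 49.0]  # incumbent vote share

intercept, slope = fit_bivariate(growth, vote_share)
fit = r_squared(growth, vote_share, intercept, slope)
print(f"intercept={intercept:.3f} slope={slope:.3f} r_squared={fit:.3f}")
```

An \(R^2\) near 1 would mean the line explains almost all of the variation in vote share, while values like those reported here leave most of it unexplained.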

Additionally, looking at the correlation between these predictors and the popular vote outcomes from 1952 to 2016, GDP growth has a moderately strong positive correlation (0.569918) and RDPI growth a weaker positive one (0.3338966). However, correlation does not imply causation, and on their own these correlations offer little evidence of a direct causal relationship between either economic indicator and popular vote share. There are likely omitted variables that I cannot control for, and the true relationship may be multivariate. It is also worth noting that these models do even worse when 2020, an outlier year in economic metrics, is included.
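In a bivariate least-squares regression with an intercept, \(R^2\) is the square of the correlation coefficient \(r\), so the correlations and the in-sample fit values carry the same information. Squaring the two correlations recovers the 32.48% and 11.15% figures:

```latex
R^2 = r^2, \qquad
r_{\text{GDP growth}}^2 = 0.569918^2 \approx 0.3248, \qquad
r_{\text{RDPI growth}}^2 = 0.3338966^2 \approx 0.1115
```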

A summary of the relationships between these and other national economic variables used as predictors is shown below.

predictor              r_squared  prediction_2024  prediction_2024_upper  prediction_2024_lower  mean_abs_error   rmse  correlation   slope  intercept
GDP                        0.045           47.839                 63.235                 32.443           2.074  5.004       -0.212   0.000     53.072
GDP_growth_quarterly       0.325           51.585                 61.310                 41.860           1.835  4.207        0.570   0.737     49.375
RDPI                       0.081           48.429                 62.212                 34.645           2.395  4.858       -0.285   0.000     54.987
RDPI_growth_quarterly      0.111           50.326                 61.756                 38.896           2.107  4.827        0.334   0.460     49.865
CPI                        0.049           48.897                 62.517                 35.278           2.081  4.993       -0.222  -0.015     53.604
unemployment               0.000           51.973                 64.112                 39.833           2.217  5.120        0.007   0.022     51.885
sp500_open                 0.040           44.365                 67.418                 21.313           2.019  5.018       -0.199  -0.002     52.861
sp500_high                 0.040           44.307                 67.346                 21.269           1.983  5.016       -0.201  -0.002     52.869
sp500_low                  0.040           44.216                 67.608                 20.825           2.010  5.018       -0.199  -0.002     52.863
sp500_close                0.039           44.309                 67.562                 21.056           2.063  5.019       -0.198  -0.002     52.859
sp500_adj_close            0.039           44.309                 67.562                 21.056           2.127  5.019       -0.198  -0.002     52.859
sp500_volume               0.049           50.735                 62.641                 38.830           2.124  4.992       -0.222   0.000     52.673
dpi                        0.035           49.793                 62.955                 36.631           2.092  5.031       -0.187   0.000     54.276

None of these models is particularly strong, either in describing the incumbent popular vote over time or in producing a 2024 prediction that clearly favors one candidate. The 13 regressions predict a popular vote share for Harris, as the incumbent party’s candidate, of between 44.216% and 51.973%, but every 95% confidence interval around these values contains 50%, so none of the predictions is statistically significant. Additional factors are also needed to understand how Harris, who represents the incumbent party but is not the sitting president, may fare within these models, something these bivariate cases cannot capture.
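To make the link between the fitted coefficients and the 2024 predictions concrete, the sketch below applies prediction = intercept + slope × predictor to three rows of the summary table. The helper names are hypothetical, the coefficients are the rounded table values, and the back-solved 2024 predictor values are therefore only approximate.

```python
# Rounded values copied from the summary table in this post.
MODELS = {
    "GDP_growth_quarterly": {
        "intercept": 49.375, "slope": 0.737,
        "prediction": 51.585, "lower": 41.860, "upper": 61.310,
    },
    "RDPI_growth_quarterly": {
        "intercept": 49.865, "slope": 0.460,
        "prediction": 50.326, "lower": 38.896, "upper": 61.756,
    },
    "unemployment": {
        "intercept": 51.885, "slope": 0.022,
        "prediction": 51.973, "lower": 39.833, "upper": 64.112,
    },
}


def predict(intercept, slope, x):
    """Predicted incumbent vote share for predictor value x."""
    return intercept + slope * x


def implied_input(intercept, slope, prediction):
    """Back out the predictor value implied by a reported prediction."""
    return (prediction - intercept) / slope


def contains_50(lower, upper):
    """True when the interval cannot rule out either side of 50%."""
    return lower < 50.0 < upper


for name, m in MODELS.items():
    x_2024 = implied_input(m["intercept"], m["slope"], m["prediction"])
    print(f"{name}: implied 2024 input ~ {x_2024:.2f}, "
          f"interval contains 50%: {contains_50(m['lower'], m['upper'])}")
```

For the GDP growth model, the implied 2024 input comes out at roughly 3, and in each of these three rows the interval spans 50%, which is why none of the predictions separates the two candidates.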

While no model is particularly good, the best of the above is GDP growth for Q2 in the election year. It fares best on every evaluation measure: the highest \(R^2\) for in-sample fit, the smallest root mean squared error, and the smallest mean absolute error after cross-validation for out-of-sample fit.
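One simple way to estimate out-of-sample error with so few elections is leave-one-out cross-validation: refit the model with each election held out and measure the error on the held-out year. The sketch below uses that scheme as a stand-in, since the exact cross-validation procedure is not spelled out here; the function names and toy numbers are assumptions, not the real data.

```python
from statistics import mean


def fit_bivariate(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def loo_mean_abs_error(x, y):
    """Leave-one-out cross-validated mean absolute error."""
    errors = []
    for i in range(len(x)):
        # Train on every observation except the i-th one.
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        intercept, slope = fit_bivariate(x_train, y_train)
        # Score the model on the single held-out observation.
        errors.append(abs(y[i] - (intercept + slope * x[i])))
    return mean(errors)


# Hypothetical toy values, NOT the real election data.
growth = [1.0, -0.5, 2.0, 0.5, 3.0, 1.5]
vote_share = [50.0, 47.0, 53.0, 51.0, 52.0, 49.0]

print(f"LOO mean absolute error: {loo_mean_abs_error(growth, vote_share):.3f}")
```

Because each held-out year plays no part in fitting the line that predicts it, this error is a fairer guide to forecasting performance than in-sample fit.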

The 2024 prediction is also sensitive to the choice of predictor. This suggests that there is some relationship between economic fundamentals and incumbent popular vote outcomes, as supported by the literature (LINK), but that it may not be a direct bivariate one. It raises the important question of whether economic fundamentals have a direct or indirect effect on an individual’s vote choice, as well as whether national (sociotropic) or individual economic measures matter more, a question often explored in the literature by scholars like Gregory B. Markus of the University of Michigan.

State Level Predictors

While sociotropic voting would point to national economic variables having a greater impact on how people vote, and therefore on the incumbent vote share, I will also investigate the potential impact of individual considerations. Some voters may weigh personal economic changes more heavily than national ones, so I will explore state-level regressions using the unemployment rate as the predictor to determine whether this improves predictive power compared to national data.

The results of these 48 regressions (excluding Hawaii, Alaska, and Washington DC due to data limitations) show a similar spread of in-sample fit values to that of the national economic fundamentals. Although we are looking at only one predictor, the Q2 unemployment rate in each state in election years from 1976 to 2016, the correlation is stronger than that of our best national predictor in states like North Dakota and near zero in states like Michigan.

Because state-level GDP growth data is not readily available, there is no way to test whether that particular predictor does better in a sociotropic or individual setting. The state unemployment rate, with an average in-sample fit of 0.132, does much better than the national unemployment rate, which explains almost none of the variance, but it still falls short of national GDP growth, which remains the best predictor here. This runs counter to some literature (LINK), which finds that RDPI growth is a much better predictor of incumbent popular vote; those are not the results found here.
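The state-level comparison amounts to fitting one bivariate regression per state and averaging the resulting \(R^2\) values. A minimal sketch of that loop follows; the state labels, numbers, and function names are made up for illustration and are not the FRED data.

```python
from statistics import mean


def r_squared_bivariate(x, y):
    """In-sample R^2 of an OLS line, computed as the squared correlation."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


def r_squared_by_state(rows):
    """Group (state, unemployment, vote share) rows and fit each state."""
    by_state = {}
    for state, unemployment, share in rows:
        xs, ys = by_state.setdefault(state, ([], []))
        xs.append(unemployment)
        ys.append(share)
    return {s: r_squared_bivariate(xs, ys) for s, (xs, ys) in by_state.items()}


# Hypothetical rows: (state, Q2 unemployment rate, incumbent vote share).
rows = [
    ("State A", 4.0, 52.0), ("State A", 5.0, 50.0),
    ("State A", 6.0, 49.0), ("State A", 7.0, 47.0),
    ("State B", 3.0, 50.0), ("State B", 4.0, 51.0),
    ("State B", 5.0, 49.0), ("State B", 6.0, 50.0),
]

fits = r_squared_by_state(rows)
average_fit = mean(list(fits.values()))
print(fits, f"average={average_fit:.3f}")
```

As in the real results, the fit can vary widely from state to state, so the average hides both strong and near-zero relationships.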

It is worth noting that there are only twelve data points in each regression, most likely leading to over-fitting of these models. This is also a concern with the slightly larger group of national economic fundamentals data points and in general when creating models to predict election outcomes. The small number of applicable elections makes it difficult to fit a model that works well out of sample.

Individual Versus Sociotropic Voting Patterns

Given the results above, sociotropic voting, in which people vote based on national economic conditions that affect others and not just themselves, is a more likely explanation within the retrospective economic voting model than voting based on individual circumstances. This is consistent with the importance of aggregate over individual conditions found in (LINK).

Achen and Bartels’ “Democracy for Realists: Why Elections Do Not Produce Responsive Government” also shows that voters may be retrospective but focus on the short period before an election when evaluating their choices, consistent with my choice of Q2 election-year metrics. As scholars Lenz and Healy note, this focus on the economy only during election years may select for the best economic manipulators rather than the best leaders, but it nevertheless provides insight into how voters use economic conditions to make decisions.

It is also important to consider how predictors and trends in incumbent popular vote may have changed over the period we are examining. People may once have voted retrospectively but no longer do so, because the parties have moved in a way that makes people unwilling to flip between them. Dassonneville and Tien’s “Introduction to Forecasting the 2020 US Elections” also highlights that economic variables may be affected by shocks that throw off predictions, one reason why I removed 2020 from all of my regressions above.

Taken together, this exploration and analysis of the literature leads me to use national GDP growth in quarter two before the election as the predictor in my forecast for 2024, leading to a two-way popular vote prediction of

Current Forecast: Harris 51.585% - Trump 48.415%

with a large margin of error and lack of significant result.

Data Sources

  • Popular Vote by Candidate, 1948-2020
  • Popular Vote by State, 1948-2020
  • FRED Economic Data, c.1927-2024
  • BEA Economic Data, 1947-2024
  • FRED Unemployment by State, 1976-2024